[HUDI-2872][HUDI-2646] Refactoring layout optimization (clustering) flow to support linear ordering #4606
@alexeykudinkin: is there anyone you know who will review this patch, or do you want me to review it?
By the way, it looks like there are some CI failures. Can you please check them?
…curve partitioner as well
Deprecated superfluous settings
`TestSpaceCurveLayoutOptimization` > `TestLayoutOptimization`
…composition methods
Tidying up configs
nsivabalan left a comment
LGTM in general. One clarification: we are not changing any config names, right? I did verify, but just wanted to confirm. If any names are changing, we might have to add the older config as an alternative.
hudi-client/hudi-client-common/src/main/java/org/apache/hudi/config/HoodieClusteringConfig.java
...n/java/org/apache/hudi/client/clustering/run/strategy/MultipleSparkJobExecutionStrategy.java
@nsivabalan Correct, all configs are kept and marked as deprecated. The only thing that changes is that some of them no longer have any effect. How should we handle this? For example
80899c4 to a0b32bb
 * The more columns involved in sorting, the worse the aggregation, and the smaller the query performance improvement.
 * Choose the filter columns which commonly used in query sql as sort columns.
 * It is recommend that 2 ~ 4 columns participate in sorting.
 * @deprecated this setting has no effect
Can you add to the documentation which other config(s) the user is supposed to look at instead of this deprecated one?
Updated
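For readers following the thread, here is a minimal sketch of the pattern under discussion: a legacy key stays defined but has no effect, and its Javadoc points at the setting to use instead. The class name, both key names, and `resolveSortColumns` are hypothetical and are not taken from `HoodieClusteringConfig` or any other Hudi class.

```java
import java.util.HashMap;
import java.util.Map;

public class DeprecatedConfigSketch {

    /** Hypothetical replacement key; not an actual Hudi config name. */
    public static final String SORT_COLUMNS_KEY = "example.clustering.sort.columns";

    /**
     * Hypothetical legacy key, kept only so existing job configs still parse.
     *
     * @deprecated this setting has no effect; set {@link #SORT_COLUMNS_KEY} instead.
     */
    @Deprecated
    public static final String LEGACY_SORT_COLUMNS_KEY = "example.layout.optimize.sort.columns";

    /** Reads only the replacement key; the legacy key is accepted but ignored. */
    public static String resolveSortColumns(Map<String, String> props) {
        return props.getOrDefault(SORT_COLUMNS_KEY, "");
    }

    public static void main(String[] args) {
        Map<String, String> props = new HashMap<>();
        props.put(LEGACY_SORT_COLUMNS_KEY, "a,b"); // present, but ignored
        props.put(SORT_COLUMNS_KEY, "c,d");        // the value that is actually used
        System.out.println(resolveSortColumns(props)); // prints "c,d"
    }
}
```

Naming the replacement in the `@deprecated` tag is what lets a user who still sets the legacy key find out where to go next.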
nsivabalan left a comment
LGTM. One nit on the documentation for the deprecated configs.
Elaborated what configs users should refer to instead
@alexeykudinkin: I pushed a minor update to fix the build failure.
…low to support linear ordering (apache#4606)
Refactoring layout optimization (clustering) flow to
- Enable support for linear (lexicographic) ordering as one of the ordering strategies (along w/ Z-order, Hilbert)
- Reconcile Layout Optimization and Clustering configuration to be more congruent
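To illustrate how linear (lexicographic) ordering differs from Z-order, here is a small self-contained sketch on two integer columns. It is not Hudi's implementation: `OrderingSketch`, `zValue`, `linearOrder`, and `zOrder` are hypothetical names, and the sketch assumes non-negative values that fit in 16 bits.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

public class OrderingSketch {

    /** Interleaves the low 16 bits of x and y; x takes the higher bit of each pair. */
    public static long zValue(int x, int y) {
        long z = 0L;
        for (int i = 0; i < 16; i++) {
            long xBit = (x >> i) & 1;
            long yBit = (y >> i) & 1;
            z |= xBit << (2 * i + 1);
            z |= yBit << (2 * i);
        }
        return z;
    }

    /** Linear ordering: sort by the first column, break ties with the second. */
    public static List<int[]> linearOrder(List<int[]> points) {
        List<int[]> sorted = new ArrayList<>(points);
        sorted.sort(Comparator.comparingInt((int[] p) -> p[0]).thenComparingInt(p -> p[1]));
        return sorted;
    }

    /** Z-order: sort by the interleaved value, so both columns influence the order. */
    public static List<int[]> zOrder(List<int[]> points) {
        List<int[]> sorted = new ArrayList<>(points);
        sorted.sort(Comparator.comparingLong((int[] p) -> zValue(p[0], p[1])));
        return sorted;
    }
}
```

For the points (1, 0) and (0, 2), linear ordering places (0, 2) first because only the first column decides, while Z-order places (1, 0) first because its interleaved value is 2 versus 4. Linear ordering therefore favors filters on the leading column, whereas the space-filling curves spread locality across all sort columns.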
What is the purpose of the pull request
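Refactoring layout optimization (clustering) flow to
- Enable support for linear (lexicographic) ordering as one of the ordering strategies (along w/ Z-order, Hilbert)
- Reconcile Layout Optimization and Clustering configuration to be more congruent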
Brief change log
Verify this pull request
This pull request is already covered by existing tests, such as (please describe tests).
Committer checklist
Has a corresponding JIRA in PR title & commit
Commit message is descriptive of the change
CI is green
Necessary doc changes done or have another open PR
For large changes, please consider breaking it into sub-tasks under an umbrella JIRA.